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Abstract — This paper presents a general stochastic model 
developed for a class of cooperative v^lreless relay networks, in 
which Imperfect knowledge of the channel state information at 
the destination node is assumed. The framework incorporates 
multiple relay nodes operating under general known non-linear 
processing functions. When a non-linear relay function is con- 
sidered, the likelihood function is generally Intractable resulting 
in the maximum likelihood and the maximum a posteriori 
detectors not admitting closed form solutions. We Illustrate our 
methodology to overcome this intractability under the example of 
a popular optimal non-linear relay function choice and demon- 
strate how our algorithms are capable of solving the previously 
intractable detection problem. Overcoming this Intractability 
involves development of specialised Bayesian models. We develop 
three novel algorithms to perform detection for this Bayesian 
model, these Include a Markov chain Monte Carlo Approximate 
Bayesian Computation (MCMC-ABC) approach; an Auxiliary 
Variable MCMC (MCMC-AV) approach; and a Suboptlmal 
Exhaustive Search Zero Forcing (SES-ZF) approach. Finally, 
numerical examples comparing the symbol error rate (SER) 
performance versus signal to noise ratio (SNR) of the three 
detection algorithms are studied in simulated examples. 



I. Background 

Cooperative communications systems, [1] and [2], have 
become a major focus for communications engineers. Particu- 
lar attention has been paid to the wireless network setting, 
as it incorporates spatial resources to gain diversity, and 
enhance connection capability and throughput. More recently, 
the focus has shifted towards incorporation of relay nodes [3], 
which are known to improve energy efficiency, and reduce the 
interference level of wireless channels [4]. 

In simple terms, such a system broadcasts a signal from 
a transmitter at the source through a wireless channel. The 
signal is then received by each relay node and a relay strategy 
is applied before the signal is retransmitted to the destination. 
A number of relay strategies have been studied in the literature 
([5]; [2]). We focus on the amplify-and-forward strategy of [6] 
in which the relay sends a transformed version (determined by 
the relay function) of its received signal to the destination. The 
relay function can be optimized for different design objectives 
([7]; [8]; [9]). It is common in the relay network literature 
to consider non-linear relay function choices which satisfy 
some concept of optimality. For example, in the estimate 
and forward (EF) scheme, in the case of BPSK signaling, 
the optimal relay function is the hyperbolic tangent [10]. 



Other criteria for which the optimal relay function is non- 
Unear include: capacity maximisation [10], minimum error 
probability at the receiver [11], SNR maximisation [9], rate 
maximisation [12] and minimisation of the average error 
probability [13]. 

Currently, in all these cited works the authors have solved 
the problem of what the optimal choice of the non-linear 
relay function should be in order to satisfy the desired design 
constraint. However, we note that in all cases, whenever a 
non-linear relay function is considered, though it may satisfy 
the constraint of optimality, it will result in an intractable 
detection problem. This intractability has not been considered 
in the literature and therefore it has not been possible to 
tackle the resulting detection problem. We specify explicitly 
how this intractability arises and then develop and demonstrate 
extensively our solution to this general relay network detection 
problem. In this paper we will demonstrate that the choice of 
relay function directly affects the tractability of the system 
model. We will focus on a single hop relay design in which 
the number of relays and the type of relay function are 
allowed to be known, general non-linear functions. However, 
our methodology trivially extends to arbitrary relay topologies 
and multiple hop networks. Fig. 1 presents the system model 
considered. Our model incorporates stochastic parameters that 
are associated with each relay channel link. That is, we 
consider parameters associated with the model as random 
variables, which must be jointly estimated with the unknown 
random vector of transmitted symbols. 

Since this framework incorporates general relay functions, 
the resulting likelihood for the model typically cannot be 
evaluated pointwise, and in some cases cannot even be written 
down in a closed form. Bayesian "likelihood-free" inference 
methodologies overcome the problems associated with the 
intractable likelihood by replacing explicit evaluation of the 
likelihood with simulation from the model. These methods are 
also collectively known as Approximate Bayesian Computa- 
tion (ABC), see [14], [15], [16], [17], [18], [19], and references 
therein for a detailed overview of the methodological and 
theoretical aspects. 

Applications of ABC methods are becoming widespread, 
and include: telecommunications [18]; extreme value theory 
[20]; protein networks, ([14], [21]); SIR models [22]; species 
migration [23]; pathogen transmission [24], non-life insurance 
claims reserving [16]; and operational risk [25]. 




Fig. 1. The system model studied in this paper involves transmission 
from one source, through each of the L relay channels, fe"', to 
the relay. Additive complex Gaussian noise, W was included at the 
receiver of the relay then the signal is processed and retransmitted by 
the relay to the destination. In this process the signal is transmitted 
again through L channels denoted by g"^ and additive complex 
Gaussian noise, V was included at the destination. 



A. Main contributions 

In this paper we develop a novel sampling methodol- 
ogy based on the Markov Chain Monte Carlo Approx- 
imate Bayesian Computation (MCMC-ABC) methodology 
[26]. Since the focus of this paper is on detection of the trans- 
mitted symbols, we will integrate out the nuisance channel 
parameters. We also develop two novel alternative solutions 
for comparison: an auxiliary variable MCMC (MCMC-AV) 
approach and a suboptimal explicit search zero forcing (SES- 
ZF) approach. In the MCMC-AV approach the addition of 
auxiliary variables results in closed form expressions for the 
full conditional posterior distributions. We use this fact to 
develop the MCMC-AV sampler and demonstrate that this 
works well when small numbers of relay nodes are considered. 
The SES-ZF approach involves an approximation based on 
known summary statistics of the channel model and an explicit 
exhaustive search over the parameter space of code words. 
This performs well for relatively small numbers of relays and 
a high signal to noise ratio (SNR). 

The paper is organised as follows: in Section II we introduce 
a stochastic system model for the wireless relay system and 
the associated Bayesian model. We discuss the intractability 
arising from these models when one considers arbitrary relay 
functions. Section III presents the likelihood-free methodology 
and the generic MCMC-ABC algorithm. In Section IV we 
present the proposed algorithm, namely maximum a posteriori 
sequence detection. In Section V we derive an alternative 
novel algorithm based on an auxiliary variable model for the 
joint problem of coherent detection and channel estimation 
for arbitrary relay functions. We contrast the performance of 
this approach with the performance of the MCMC-ABC based 
algorithm. Section VI presents a detector that is based on 
known summary statistics of the channel model and an explicit 
exhaustive search over the parameter space of code words. We 
shall demonstrate its performance, whilst acknowledging its 



flaws. Section VII presents results and analysis and conclu- 
sions are provided in Section VIII. 

The following notation is used throughout: random variables 
are denoted by upper case letters and their realizations by 
lower case letters. In addition, bold will be used to denote a 
vector or matrix quantity, upper subscripts will refer to the 
relay node index and lower subscripts refer to the element 
of a vector or matrix. We define the following notation that 
shall be used throughout, g ~ gi-^ — (g*-^-*, • • • ,3^^'); g'"' 
refers to the state of the Markov chain at iteration n; g^" 
refers to the i-th element of gl"'. We define the generic 
transition kernel for the Markov chain as q{9* -(— 6'["~^1), 
which updates the Markov chain for the posterior parameters, 
9, from the [n — l)-th Markov chain step to the n-th via the 
proposal q. 

II. Bayesian system model and MAP detection 

In this section we introduce the system model and a 
Bayesian model for inference on the system model parameters. 
In our system model, the channels in the relay network are 
modelled stochastically, where we do not know a priori the 
realized channel coefficient values. Instead, we consider partial 
channel state information (CSI), in which we assume known 
statistics of the distribution of the channel coefficients. 

A. System model and assumptions 

Here we present the system model and associated assump- 
tions. The system model is depicted in Fig. 1. 

1) Assume a wireless relay network with a single source 
node, transmitting sequences of K symbols denoted s == 
si:K, to a single destination via L relay nodes. 

2) The symbols in the sequence of K symbols s are taken 
from a digital constellation with cardinality M. 

3) There are L relays which cannot receive and transmit on 
the same time slot and on the same frequency band. We 
thus consider a half duplex system model in which the 
data transmission is divided into two steps. In the first 
step, the source node broadcasts a code word s G O 
from the codebook to all the L relay nodes. In the 
second step, the relay nodes then transmit their signals 
to the destination node on orthogonal non-interfering 
channels. We assume that all channels are independent 
with a coherence interval larger than the codeword 
length K. 

4) Assume imperfect CSI in which noisy estimates of the 
channel model coefficients for each relay link are known. 
This is a standard assumption based on the fact that a 
training phase has been performed a priori. This in- 
volves an assumption regarding the channel coefficients 
as follows: 

• From source to relay there are L i.i.d. channels pa- 

where 



rameterized by Ih'-'-^ r^ F Ch'''-\<Tl\\ 
F{-) is the distribution of the channel coefficients, 
h} ' is the estimated channels coefficient and ct^ is 
the associated estimation error variance. 



• From relay to destination there are L i.i.d. channels 
parameterized by |G^" ^ F(g''\cr^)} , where 
F{-) is the distribution of the channel coefficients, 
Tj^^' is the estimated channels coefficient and cr^ is 
the associated estimation error variance. 
5) The received signal at the l-th relay is a random vector 
given by 



r(0 = sij(') + W('\ / G {1, ■ • ■ , L} , 



(1) 



where H^^> is the channel coefficient between the trans- 
mitter and the l-th relay, S G ^m is the transmitted 
code-word and W^'' is the noise realization associated 
with the relay receiver 
6) The received signal at the destination is a random vector 
given by 

YW^f(0(R(0)G(0+v«, /e{l,---,i}, (2) 



where G*^" is the channel coefficient between the l-th 
relay and the receiver, 

f« (rW) A \^j(i) (^^(i)'^ ^ _ _ ^ j(0 (4^)] ^ is the mem- 

oryless relay processing function (with possibly different 
functions at each of the relays) and V^'-* is the noise 
realization associated with the relay receiver. 
7) Conditional on {hy'\g^'\ ,_,, we have that all re- 
ceived signals are corrupted by zero-mean additive white 
complex Gaussian noise. At the l-th relay the noise 
corresponding to the l-th transmitted symbol is denoted 
by random variable W^ ^ CK (O, cr^j. At the receiver 
this is denoted by random variable V,-^ ^ CK (0,(7^). 
Additionally, we assume the following properties: 



E 



w,(')T^f 



= E 









yi,j eJl,...,K},\fl,m e {l,...,L},i ^ j,l ^ m, 
where Wj denotes the complex conjugate ofWj. 

B. Prior specification and posterior 

Here we present the relevant aspects of the Bayesian model 
and associated assumptions. In order to construct a Bayesian 
model we need to specify the likelihood p (y|s, h, g), and the 
priors for the parameters s,h,g which are combined under 
Bayes theorem to obtain a posterior p(s, g, h|y). We begin 
by specifying the prior models for the sequence of symbols 
and the unknown channel coefficients. At this stage we note 
that in terms of system capacity it is only beneficial to transmit 
a sequence of symbols if it aids detection. This is achieved by 
having correlation in the transmitted symbol sequence si-k- 
We assume that since this is part of the system design, the 
prior structure for p{si-k) will be known and reflect this 
information. 

1) Under the Bayesian model, the symbol sequence is 
treated as a random vector S = Si;K- The prior 
for the random symbols sequence (code word) Si:k is 
defined on a discrete support denoted f2 with \Q\ — 
M^ elements and probability mass function denoted by 
p{si:k)- 



2) The assumption of imperfect CSI is treated under a 
Bayesian paradigm by formulating priors for the chan- 
nel coefficients as follows: 

• From source to relay there are L i.i.d. channels pa- 
rameterized by I i/('' ^ CK ( /i*^'', cr^ j [ , where 

hS' is the estimated channels coefficient and af^ the 
associated estimation error variance. 

• From relay to destination there are L i.i.d. channels 
parameterized by JG^'-* ^ 6^ (s'''^ ^V) };-i' where 
g^^' is the restituted channels coefficient and <j^ the 
associated estimation error variance. 



C. Inference and MAP sequence detection 

Since the primary concern in designing a relay network 
system is on SER versus SNR, our goal is oriented towards 
detection of transmitted symbols and not the associated prob- 
lem of channel estimation. We will focus on an approach 
which samples Si;KtG — Gi:l,H — Hil jointly from the 
target posterior distribution, since our model also considers 
the channels to be stochastic and unknown. 

In particular we consider the maximum a posteriori (MAP) 
sequence detector at the destination node. Therefore the goal 
is to design a MAP detection scheme for s at the destination, 
based on the received signals {y^'''};_i5 the noisy channels 

(T^, (Tg, and the noise variances, a'^ and cr^. 

Since the channels are mutually independent, the received 

signals {r^'''}, , and {y'''^}, , are conditionally independent 
given s, g, h. Thus, the MAP decision rule, after marginalizing 
out the unknown channel coefficients, is given by 



— 0, s = argmaxp (s|y) 



argmaxj^ / / p f s,s(*''><''|y<'' j dhdg 



= argmaxj^ p (^y<''|s, /i''',p''' j p(s)p (g)p(h) dhdg. 

(3) 

Therefore, in order to perform detection of the transmitted 
symbols, we need to evaluate the likelihood function of the 
model. In the next section we will demonstrate that the like- 
lihood function in our model is intractable and as a result we 
develop the likelihood-free based methodology and associated 
MCMC-ABC algorithm to perform the MAP detection. 

D. Evaluation of the likelihood function 

The likelihood model p (y^'^js, ft,('\ g*^'') for this relay 
system is in general computationally intractable as we now 
show. There are two potential difficulties that arise when 
dealing with non-linear relay functions. The first relates to 
finding the distribution of the signal transmitted from each 
relay to the destination. This involves finding the density of 
the random vector f(') (Si/^'' + W^')) G^'^ conditional on 
realizations S = s,H = h, G = g. This is not always possible 
for a general non-linear multivariate function f ^'^ . Conditional 



on S = s, H = h, G = g, we know the distribution of 

R(')|s,g,h, 

PR(r|s,g,h) 
= p (s/jW + w(')|s, /jW, g(')) - eW {sh^'\<yll) . 

However, finding the distribution of the random vector after the 
non-linear function is apphed i.e. the distribution of f (R^'') = 
f (R(')) G*^'-' given s, h^^\g^^\ involves the following change 
of variable formula 



p(f(rW) Is, /.('),/)) 

Jf«)-i (rW)|s, /.('), gW) 



af(0 



9rW 



which can not always be written down analytically for arbitrary 
f . The second more serious complication is that even in cases 
where the density for the transmitted signal is known, one must 
then solve a iiT-fold convolution to obtain the likelihood: 



.(0 



.(0 



V ( y^^ |s, g, h ) = p [i [r^'^j |s, g, hj * pv(0 

p (^f (z|s, g, h) j pv(o (y^'' - zj dzi . . . dzK, 

(4) 

where * denotes the convolution operation. Typically this 
will be intractable to evaluate pointwise. However, in the most 
simplistic case of a linear relay function, that is, f ('-' [x) = x, 
the likelihood can be obtained analytically as 



p(y«|s,;i«,.g('))=eK('s/iW5W, 



(0 



where I is the identity matrix. 

Hence, the resulting posterior distribution in (3) involves 
combining the likelihood given in (4) with the priors for the 
sequence of symbols and the channel coefficients. 

III. Likelihood-free methodology 

In this section, we provide a concise background intro- 
duction on likelihood-free methodology, also known as ap- 
proximate Bayesian computation, and the sampling algorithms 
utilised to obtain samples from the ABC posterior model, 
based on [18], [22], [26], [17]. Likelihood-free inference 
describes a suite of methods developed specifically for work- 
ing with models in which the likelihood is computationally 
intractable. We consider the likelihood intractability to arise in 
the sense that we may not evaluate the likelihood pointwise, 
such as in (4). In particular, for the relay models considered, 
we can only obtain a general expression for the likelihood 
in terms of multivariate convolution integrals and hence we 
do not have an explicit closed form expression for which to 
evaluate the likelihood pointwise. 
ABC Methodology and MCMC-ABC: 
All ABC methodologies are based upon the observation that 
one may replace the problem of evaluating the intractable like- 
lihood point-wise with the typically trivial exercise of simulat- 
ing from the likelihood model. Simulation from the likelihood 
usually involves trivial generation of random variables from 



a known parametric distribution, followed by application of a 
transformation corresponding to the imposed physical model. 
It is shown in [27] and further developed in [18] that 
the ABC method embeds an "intractable" target posterior 
distribution (resulting from the intractability of the likelihood 
function (4)), denoted by p (s|y), into an augmented posterior 
model, 

p (s, x|y) oc p (y|x, s; e) p (x|s) p (s) , 

where x G X is an auxiliary vector on the same space as 
observations y. In this augmented Bayesian model, the density 
p(y|x, s;e) acts as a weight for the intractable posterior. It 
compares the observations y with the auxiliary ("synthetic 
data" x) generated from the likelihood model, and determines 
whether or not they are within an e tolerance of each other. 
Choices for this weighting, p (y|x, s; e), will be explained 
below. For detailed justification for this framework see [18], 
[16] and [26]. In this paper we consider marginal sampler. 
Summarising the ideas in these papers, we note that the target 
marginal posterior p(s|y) can be represented in the ABC 
framework as follows 



p(s|y) cxp(s)p(y|s) 

==p(s) / P(y|x,s;e)p(x|s)dx 
= P (s) IEp(x|s) b(y|x, s; e)] 
-^('^.f]My|x^s;e):^p(s|y). 



(5) 



D 



d=l 



Presenting the ABC approximate posterior in this way allows 
us to provide intuition for the ABC mechanism. To understand 
this, we note that the Monte Carlo approximation, denoted by 
p{s\y), of the integral with respect to the intractable likeli- 
hood, is approximated via D draws of auxiliary variables x'' 
("synthetic data") from p(x|s). Therefore, this demonstrates 
explicitly how one can replace the evaluation of the likelihood 
with simulation from the likelihood model. As discussed in 
[16], [17], the choice of weighting function can affect the 
Monte Carlo estimate. The function p(y|x,s; e) is typically a 
standard smoothing kernel [28] with scale parameter e which 
weights the intractable posterior with high values in regions 
when the observed data y and auxiliary date x are similar For 
example, uniform kernels are commonplace in likelihood-free 
models (e.g. [26]), although alternatives such as Epanechnikov 
[15] and Gaussian kernels [17] provide improved efficiency. 

In this paper we consider two popular choices for weighting 
functions, p(y|x, s;e), which are easily interpreted as hard 
and soft decision functions [17]. The popular "Hard Decision" 
(HD) is given by 



P(y|x,s;e) oc 



1, ifp(y,x)<e, 
0, otherwise. 



(6) 



This makes a hard decision to reward those simulated auxiliary 
variables, x, that are within an e-tolerance of the actual 
observed data, y, as measured by distance metric p. Another 
example is a "Soft Decision" (SD) weighting function given 



by 



P(y|x,s;e) ocexp ( — 



(7) 



An important practical difference between the HD and SD 
kernels is that, even though the weighting of the intractable 
posterior, obtained by the SD density, may be small, it will 
remain non-zero unlike the HD rule. 

The choice of e is therefore important for performances of 
the ABC methodology. If e is too large, the approximation 
p (s|y) is poor; for example when e — ;> oo the resulting ABC 
posterior p{s\y) in (5) corresponds to the prior since the 
weighting function is uniform for all values of the posterior 
parameters. However, if e is sufficiently small, p(s|y) is a 
good approximation of p (s|y). We note that there is no 
approximation when 6 = 0. From a practical computational 
cost perspective, we will discuss why it is not feasible to set 
e = and instead we must settle for selecting an e > via 
a trade-off between accuracy and computation cost. There are 
several guidelines to specify a sequence of tolerances that one 
may follow to achieve a given e, see [29], [17], [21]. 

The following two sections provide specific background 
detail related to specifications of the weighting functions. We 
then finish this section by presenting the generic MCMC-ABC 
algorithm used to obtain samples from the marginal posterior 
p (s|y) in order to solve the MAP detection problem in (3). 

A. Data Summaries 

We note that when the total number of observations is large, 
then comparing directly the simulated data x to the observed 
data y in the weighting function is not computationally effi- 
cient. Therefore, one considers dimension reduction. The stan- 
dard approach to dimension reduction is to compare summary 
statistics, T(y), which should summarise the information 
present in the data. The summary statistic T (y) is called a 
sufficient statistic for y if and only if p (s|y) oc p (s|T (y)) , 
i.e., according to the Neayman-Fisher factorisation theorem, 
we can replace the conditional distribution of y given the 
parameters with the conditional distribution summary statistics 
given the parameters and a constant with respect to the 
parameters. If a sufficient statistic is available, then using 
the summary statistics is essentially the same as using the 
whole data. Therefore, as e ^ 0, the ABC approximation 
p(s|T (y)) -^ p(s|y)). However, in most real practical models, 
sufficient statistics are unknown and one must make alternative 
choices. 

B. Distance Metrics 

Having obtained summary statistic vectors T (y) and T (x), 
likelihood-free methodology then measures the distance be- 
tween these vectors using a distance metric, denoted generi- 
cally by p (T (y) , T (x)). The most popular example involves 
the basic Euclidean distance metric which sums up the squared 
error between each summary statistic as follows: 



p(T(y),T(x)) 



dim(T) 

^ (T,(y)-T,(x)f . 



Recently more advanced choices have been proposed and their 
impact on the methodology has been assessed [17]. These 
include: scaled Euclidean distance given by the square root 
of 

dim(T) 

p (T (y) , T (x)) - X^ A. (T, (y) - T, (x))" ; (9) 



Mahalanobis distance: 

p (T (y) , T (x)) = (T (y) - T (x)) ^-^ (T (y) - T (x))^ ; (10) 
L'P norm: 



dim(T) 



Pll/P . 



p(T(y),T(x))= Y. [|T,(y)-T,(x)n^-; (11) 

i 

and city block distance: 

dim(T) 

" (12) 



dim(T) 

p(T(y),T(x))= Y. |T,(y)-T,(x) 



In particular, we note that distance metrics which include 
information regarding correlation between elements of each 
summary statistic vector, result in improved estimates of the 
marginal posterior 

C. MCMC-ABC Samplers 

MCMC -based likelihood-free samplers were introduced to 
avoid the basic rejection sampling algorithms (see for example 
[30]) which are inefficient when the posterior and prior are 
sufficiently different [26]. Therefore, an MCMC approach that 
is based on the ABC methodology has been devised [26]. The 
generic MCMC-ABC sampler is presented in Algorithm 1. 

Algorithm 1 generic MCMC-ABC sampler 
Input: p{s) 

Output: {s["l} , samples from p(s|y) 
Initialise s^^l to an arbitrary staring point 
for n = 2,..., TV do 

Propose s* ~ g(s* ■<— s'""^') 

Simulate synthetic data from the model, H. ^ p 

if p(T(y),T(x))<ethen 

Compute the MCMC acceptance probability: 



(x|s* 



a (; 



S',Si' 



min < 1, 



P(s*) 



q(s 



["-11. 



9: 

10: 
11 
12: 
13: 
14: 
15: 
16: 



Generate u -- t/ [0, 1] 

if u< a(s*,s["-il) then 

s["l = s* 
else 

s["l = sl"^-^! 
end if 
else 

g[n] _ §["-1] 

end if 
end for 



p(s['»-il) (?(s' 



-£) 
-U) 



(g-) In the next section we present details of choices that must 

be made when constructing a likelihood-free inference model. 



IV. MCMC-ABC Based Detector 

We now relate the MCMC-ABC generic sampler to the 
specific problem presented in this paper As mentioned 
previously, the ABC method we consider embeds an in- 
tractable target posterior distribution, in our case denoted by 
p{si:K,hi;L,gi:L\y), into a general augmented model 



P isi:K, hl;L,gi:L, X|y) OC p (y|x, Sl:K, /ll:L, 51:l) 
P {x]Sl:K , hl.,L, gi:L) P {s1:k) P {hl:L) P (gV.L) , 



(13) 



where x is an auxiliary vector on the same space as y. In 
this paper we make the standard ABC assumption, for the 
weighting function, p{y\x,si:K,hi:L,gi:L) = p(y|x), see 
[21]. 

Hence, in the ABC context, we obtain a general ap- 
proximation to the intractable full posterior, denoted by 
p{si;K,hi:L,gi:L\y)- We are interested in the marginal tar- 
get posterior, p{si:K\y), which, in the ABC framework is 
approximated by 

P{si:K\y) 

■p {y\-X., Sl:K , hl:L, gi:L; (^) P {:X.\S1:K , hl:L, QI-.l) P (si:k) 

p {hi:L) p {gi-.h) dhdgdx 

N D 

- p («-^) E eK^'^'''"'' '^^^' '''■^' ^i"^' ') ^ (''^■^ K^i^ 

n=l d=l 
■-p{si:K\y), 

(14) 

where x'^f"' represents the d-th realisation at the n-th step 
of the Markov chain. In this paper we are interested in the 
marginal posterior ABC approximation p(si:/i-|y), as we wish 
to formulate the MAP detector for the symbols. 

Next we present the resulting MCMC-ABC algorithm to 
perform MAP detection of a sequence of transmitted symbols. 
In particular our MCMC sampler is comprised of a random 
scan Metropolis-Hastings (MH) within Gibbs sampler [31]. 
The Gibbs sampler is a special case of the MH algorithm. 
It is typically used when one wishes to update sub-blocks of 
the posterior parameters at each iteration of the Markov chain. 
This is especially beneficial when the full conditional posterior 
distributions for each sub-block can be expressed in closed 
form expressions and sampled. In this case the transition kernel 
for the Markov chain on the full set of posterior parameters 
becomes a product of the full conditional posterior densities, 
resulting in acceptance probability of one, see [32] . If however, 
one cannot sample easily from any of the conditional posterior 
densities, an MH algorithm is used to produce a sample. This 
algorithm is referred to as a Metropolis-Hastings within Gibbs 
sampler 

Algorithm 2 presents the details of the sampler, where we 
use the compact notation O = (S, G, H). In order to utilise the 
Metropolis-Hastings algorithm, we now specify the Markov 
chain transition kernels for S, G, H, otherwise known as the 
proposal distributions for the sampler. We design the sampler 
to update a single component of the posterior parameters, at 
every iteration of the algorithm, as it allows us to design 
proposals that insure a reasonable acceptance probability in the 
Markov chain rejection step. The transition kernels, denoted 
generically by q{9* ^ 6'["^^1), which updates the Markov 



Algorithm 2 MAP sequence detection algorithm using 

MCMC-ABC 

Initialize Markov chain state: 

1: Initialize n=l, S'^l 
2: for n =: 2,...,A^ do 

Propose new Markov chain state: O* given 9^""^!. 
3: Draw an index i ^ U [I, . . . ,K + 2L\ 
4: Draw proposal 



'P(S), 9^i}l =51:L, /il:L ^hl.,L 










from proposal distribu- 
tion q[6'' ^ t)^° "'). (Note the proposal will depend on 
which element of the 8 vector is being sampled.) 
ABC posterior (14): 

Generate auxiliary variables x[ , . . . ,Xp/ from the 
model, p(x(')|0*), for / == l,...,i, to obtain a 

realization of X = x = [x*^^\ . . . , x*^^^] by: 
(5.a) Sample W*''* ~ 6^1" (0, all) , i G {1, ■ • • , L}. 
(5.b) Sample V'''* ~ CDM" (0, all) , I £{!,■■■ ,L}. 
(5.C) Evaluate X*'' = f('> fs*/i(''* + W 



HO* 



,(0* 



6: 

7: 



+V«',iG{l,...,i}. 
Calculate a measure of distance p (T(y), T(x)) 
Evaluate the acceptance probability 



p(r|y,6„)g(r ^0'"-^i) 
p(6»I"-il|y,e„_i)g(6»I"-il->6l*) 

where pabc (^*|yj ^n), depending whether HD or SD 



a(e["-'',e*j =min<{l, 

where PABC (^*|y,' 
is used, is given by: 



HD: p(r|y,e„) 

(p(st:K)p(^l:L)p(3l:L), if P (T(y) , T(x)) < e„ 



0, 



SD: p(r|y,e„) 



otherwise; 



p(T(y),T(x))\ , . , ,,. , , . , 
« exp [ - ^^ ^%' ^ " ) p {sIk) p {hl,^) p {qIl) . 



Sample random variate u, where C/ ^ C/ [0, 1]. 
if u< a (©["-11,0*) then 
0N ^ 0* 

else 

©M = 0["-il. 

end if 
end for 



chain for the posterior parameters, 6, from the [n — l)-th 
Markov chain step to the n-th via the proposal q. In this 
case we decompose q into two components: the first, denoted 
by q{i), specifies the transition probability mass function for 
sampling a proposed element of the posterior parameter vector 
to be updated; and the second, denoted by q{6i\9^ J, 
specifies the probability density for proposing a new Markov 
chain state for element 6i conditional on its previous state at 
iteration n — 1 . The specific transition kernels we specified for 
our algorithm are given by: 

• Transition kernel for Si-k'- draw a proposal SJ.^ from 



distribution 



1 1 



i](5„[„-i] 



(15) 



KlOg2M ^l:i-l ^^+1:K 



Transition kernel for G^: draw proposal G* from distri- 
bution 



<i(s ^ g; 



<iii)i{g?^\st'^] 



fe^ g. 



(16) 



• Transition kernel for Hi: draw proposal H* from distri- 
bution 

g(h* ^hf-^l) =q(/)g(ft«jhr^l) = ieK(h["-^UU 

(17) 

where S^ denotes a dirac mass on location (j), and q{i) and 
q{l) respectively denote the uniform probabilities of choosing 
indices i e {1, . . . ,K} and I e {1, . . . , L}, and al_^^, al_^^ 
represent the transition kernels variance. Thus, in every iter- 
ation we update a single component of the symbols s!,"]^, a 
single component of the transmitter-relay channels g^" , and a 

fnl 

single component of the relay-receiver channels h.\ . 

We now present details about the choices made for the ABC 
components of the likelihood-free methodology. 

A. Observations and synthetic data 

The data y = yi-^ correspond to the observed sequence of 
symbols at the receiver The generation of the synthetic data 
in the likelihood-free approach involves generating auxiliary 
variables x\\. . . , x^J from the model, p (x^'^ |s, g, h), for I = 

1, . . . , L, to obtain a realization x = [x(^\ . . . , x'^^] . This 
is achieved under the following steps: 

a. Sample W^)* - 6:^(0, cr^l)- 

b. Sample VC)* - 6:^(0,(721^ 

c. Evaluate the system model in (2), 

f(0 (^s.;,(0. + w(')*) p«* + V^'^J G {1, . . . , L} . 



X 



(0 



B. Summary statistics 

As discussed in Section III-A, summary statistics T(-) are 
used in the comparison between the synthetic data and the 
observed data via the weighting function. There are many 
possible choices of summary statistic. Most critically, we want 
the summary statistics to be as close to sufficient statistics as 
possible whilst low in dimension. The simplest choice is to 
use T (y) = y i.e. the complete dataset. This is optimal in 
the sense that it does not result in a loss of information from 
summarising the observations. The reason this choice is rarely 
used is that typically it will result in poor performance of 
the MCMC-ABC as discussed previously. To understand this, 
consider the HD rule weighting function in (6). In this case, 
even if the true MAP estimated model parameters were utilized 
to generate the synthetic data x, it will still become improbable 
to realise a non-zero weight as e — > 0. This is made worse 
as the number of observations increases, through the curse 
of dimensionality. As a result, the acceptance probability in 
the MCMC-ABC algorithm would be zero for long periods. 



resulting in poor sampler efficiency. See [25] for a related 
discussion on chain mixing. We note that the exception to 
this rule is when a moderate tolerance e and small number of 
observations are used. 

A popular and practical alternative to utilizing the entire 
data set, is to use empirical quantile estimates of the data 
distribution. Here we adopt the vector of quantiles T (y) = 
[$0.1 (y) , • • • , ?o.9 (y)], where q^ (y) denotes the a-level em- 
pirical quantile of y, see for example [20], [24]. 

C. Distance metric 

For the distance metric p (T(y), T(x)), as a component of 
the weighting function, we use the Mahalanobis distance 

p (T (y) , T (x)) = [T (y) - T (x)]^ ^~' [T (y) - T (x)] , 

where Ex is the covariance matrix of T(x). In Section VII we 
contrast this distance metric with scaled Euclidean distance, 
obtained by substituting diag (Sx) for Sx in the above. 

We estimate Ex as the sample covariance matrix of T(x), 
based on 2, 000 likelihood draws from p I y|si:_ff, ft.i:L, gi:L ) 
where si-k is a mode of the prior for the symbols and 
the channel coefficients are replaced with the partial CSI 
estimates. 

We note that in principal, the choice of matrix Ex is 
immaterial, in the sense that in the limit as e — > 0, then 
p{si:K\y) — > p{si:K\y) assuming sufficient statistics T(x). 
However, in practice, algorithm efficiency is directly affected 
by the choice of Ex. We demonstrate this in Section VII. 



D. Weighting function 

For weighting function p{y\x,Si:K,hi:L,gi:L) = p(y|x) 
we consider the HD weighting function in (6) and the SD 
weighting function in (7). 



E. Tolerance schedule 

With a HD weighting function, an MCMC-ABC algorithm 
can experience low acceptance probabilities for extended pe- 
riods, particularly when the chain explores the tails of the 
posterior distribution (this is known as "sticking" c.f. [25]). In 
order to achieve improved chain convergence, we implement 
an annealed tolerance schedule during Markov chain burn-in, 
through e„ = max { A^ — lOn, e™'"}, where e„ is the tolerance 
at time n in the Markov chain, N is the overall number of 
Markov chain samples, and e™'" denotes the target tolerance 
of the sampler. 

There is a trade-off between computational overheads (i.e. 
Markov chain acceptance rate) and the accuracy of the ABC 
posterior distribution relative to the true posterior. In this 
paper we determine e™'" via preliminary analysis of the 
Markov chain sampler mixing rates for a transition kernel with 
coefficient of variation set to one. In general practitioners will 
have a required precision in posterior estimates. This precision 
can be directly used to determine, for a given computational 
budget, a suitable tolerance e™'". 



F. Performance diagnostic 

Given the above mixing issues, one should carefully monitor 
convergence diagnostics of the resulting Markov chain for a 
given tolerance schedule. For a Markov chain of length A^, 
the performance diagnostic we consider is the autocorrelation 
evaluated on N ~ N ~ Nb post-convergence samples after 
an initial burn-in period Nt,. Denoting by {O" }„^i.ff the 
Markov chain of the i-th parameter after burn-in, we define 
the autocorrelation estimate at lag r by 



Algorithm 3 MAP sequence detection algorithm using AV- 

MCMC 

Initialize Markov chain state: 

1: Initialize n=l, S^^l ^ P (s), <7i.^ = di-.L, hi.j^ — hi-L, 

W(o) - p (w) 
2: for n = 1 , . . . , A^ do 

Propose new Markov chain state: 6* given 6["~^1. 
3: Draw an index i ^ U\l, ...,K + 2L + KL] 






n[«-l] 
i+l:K+2L+KL 



from 



ACFiO^T) 



1 



N-T 



(N - t)<t {e,) ^^ 



(18) 

where IliOi) and a {6i) are the estimated mean and standard 
deviation of 9i. 

V. Auxiliary variable MCMC approach 

In this section we demonstrate an alternative solution to 
the previously presented MCMC- ABC detector We will show 
that at the expense of increasing the dimension of the pa- 
rameter vector, one can develop a standard MCMC algorithm 
without the requirement of the ABC methodology. We aug- 
ment the parameter vector with the unknown noise realiza- 
tions at each relay, y^i-K, to obtain a new parameter vector 
{sl■,K^gl■.L^hl■.L^'^l■.K)■ The rcsuMng posterior distribution 
p (si:x,5i:L, ^i:L, wi:;4:|y) may then be decomposed into the 
full conditional distributions: 

p{S'i_:K\gi:L,hl:L,'Wx:K,y) CC P {y\si:K , Ql-.L ,h-l:L ,'W-1:k) P {^i-.k) , 

(19a) 

P {gi:L\si:K , hl:L ,Wl:K ,y) OC p (y\si:K , Ql-.L , hl-.L , Wi-.k) P {QI-.l) , 

(19b) 

p{hl:L\si:K,gi:L,Wl:K,y) OC p (y\si:K , Ql-.L , hl-.L , Wi-.k) P {Iii-.l) , 

(19c) 

p{-Wl:K\si:K,gi:L,hl:L,y) <X p {y\si:K , Ql-.L , hl-.L , Wi-.k) P {wi-.k) , 

(19d) 

which form a block Gibbs sampling framework. Conditioning 
on the unknown noise random variables at the relays permits 
a simple closed form solution for the likelihood and results 
in tractable full conditional posterior distributions for (19a)- 
(19d). In this case the likelihood is given by 

L 

p {y\si.,K, gi:L, /ii:L, wi;k) ^JJp (y*'' \si:K, g'-^\ h'-^\ vf'-^'^ 



4: Draw proposal O* = 
proposal distribution 

(Note, the proposal will depend on which element of 
the O vector is being sampled.) 
5: Evaluate the acceptance probability 



fQln-l]^Q*\ =minJ 1,- 



p{e*\y)q{e*^ei^' 



(6>["-ll|y)g(6»["-ll -^9*) 



Sample random variate u, where U ^ U [0, 1]. 
if u< a(0["-il,©*) then 
0N = 0* 

else 

©N — ©[""11. 
end if 
end for 



where 

p(y^'^\si..K,g^'\h^'\w('A=ey^(f^'Ush^'^+W^'Ag^'\a^a 



The resulting Metropolis-within Gibbs sampler for this block 
Gibbs framework is outlined in Algorithm 3, where we define 
the joint posterior parameter vector Q = (S, G, H, W). The 
Metropolis-Hastings proposals used to sample from each full 
conditional distribution were given by eqs. (15)-(17), and the 
additional proposal for the auxiliary variables, given by 
• Transition kernel for W^ : draw proposal W* from 
distribution 

1 ___/ ._„ . X (20) 



KL 






The MCMC-AV approach presents an alternative to the 
likelihood-free Bayesian model sampler, and produces exact 
samples from the true posterior following chain convergence. 
While the MCMC-AV sampler still performs joint channel 
estimation and detection, the trade-off is that sampling the 
large number of extra parameters will typically result the need 
for longer Markov chains, to achieve the same performance as 
the ABC algorithm (in terms of joint estimation and detection 
performance). This is especially true in high dimensional 
problems, such as when the sequence of transmitted symbols 
K is long and the number of relays L present in the system 
is large, or when the posterior distribution of the additional 
auxiliary variables exhibits strong dependence. 

VI. Alternative MAP detectors and lower bounds 

One can define a suboptimal solution to the MAP detector, 
even with an intractable likelihood, involving a naive, highly 
computational algorithm based on a Zero Forcing (ZF) so- 
lution. The ZF solution is popular in simple system models 
where it can be efficient and performs well. 

Under a ZF solution one conditions on some knowledge 
of the partial channel state information, and then perform an 
explicit search over the set of all possible symbol sequences. 
To our knowledge a ZF solution for MAP sequence detection 
in arbitrary non-linear relay systems has not been defined. 
Accordingly we define the ZF solution for MAP sequence 
detection as the solution which conditions on the mean of 
the noise at the relay nodes, and also uses the noisy channel 
estimates given by the partial CSI information, to reduce the 
detection search space. 



A. Sub-optimal exhaustive search Zero Forcing approach 

In this approach we condition on the mean of the noise 
W^') = 0, and use the partial CSI estimates of channels 

coefficients, ih^^\g'^^^ > , to reduce the dimensionality of 

the MAP detector search space to just the symbol space Q. 
The SES-ZF-MAP sequence detector can be expressed as 



sen ;^^ V 

L 

sea l\ V ^ 

(21) 

Thus, the likelihood model results in a complex Gaussian 
distribution for each relay channel, as follows 



P 



y(')|s,G«-g('),i7(')=/i«,W«=0 



e:N-ff(')fs/iW)gW,a2l 



(22) 



As a result, the MAP detection can be solved exactly using 
an explicit search. 

Note however that this approach to symbol detection also 
involves a very high computational cost, as one must evaluate 
the posterior distribution for all M^ code words in 51. It is 
usual for communications systems to utilise M as either 64- 
ary PAM or 128-ary PAM and the number of symbols can 
be anything from K = 1 to K = 20 depending on the 
channel capacity budget for the designed network and the 
typical operating SNR level. Typically this explicit search is 
not feasible to perform. However, the sub-optimal ZF-MAP 
detector provides a comparison for the MCMC-ABC approach, 
which at low SNR should be a reasonable upper bound for the 
SER and for high SNR an approximate optimal solution. 

The SES-ZF-MAP sequence detector can be highly sub- 
optimal for low SNR values. This is trivial to see, since 
we are explicitly setting the noise realisations to zero when 
the variance the noise distribution is large. For the same 
reasoning, in high SNR values, the ZF approach becomes close 
to optimal. 



B. Lower bound MAP detector performance 

We denote the theoretical lower bound for the MAP detector 
performance as the oracle MAP detector (OMAP). The OMAP 
detector involves conditioning on perfect oracular knowledge 

of the channels coefficients j/i^'-*, c;*^') }. and of the realized 

noise sequence at each relay W*^'). This results in the likeli- 
hood model for each relay channel being complex Gaussian, 
resulting in an explicit solution for the MAP detector Ac- 
cordingly, the OMAP detector provides the lower bound for 
the SER performance. The OMAP detector is expressed as 



sen 



s — argmax \\p (s|y , IIj = argmax 1 [ p ( y' '|s, II) p(s 



es2 



sian distribution for each relay channel, as follows 



(23) 



However, clearly this is impossible to evaluate in a real system, 
since oracular knowledge is not available. 

VII. Results 

We compare the performance of joint channel estimation 
and detection under the ABC relay methodology versus the 
auxiliary MCMC approach for different model configurations. 
Additionally, we compare the detection performance of the 
ABC relay methodology, the auxiliary MCMC approach, the 
optimal oracle MAP sequence detector and the SES-ZF MAP 
sequence detector. Each presented Monte Carlo sampler has 
a burn-in of 5,000 iterations followed by a further 15,000 
recorded iterations (N — 20,000). Random walk proposal 
variances for each of the parameters were tuned off-line to 
ensure the average acceptance probability (post burn-in) was 
in the range 0.3 to 0.5. 

The following specifications for the relay system model are 
used for all the simulations performed: the symbols are taken 
from a constellation which is 4-PAM at constellation points in 
{—3,-1,1,3}; each sequence contains K ~ 2 symbols; the 
prior for the sequence of symbols is p{{si,S2) = [1,1]) ~ 
p{{si,S2) = [—1,1]) = 0.3 otherwise, all other are equi- 



probable; the partial CSI is given by cr 



= 0.1; the 



where n:= (G^'' =g^^\H^^^ =/i('),W(') = W^'). 

In this case, the likelihood model results in a complex Gaus- 



nonlinear relay function is given by / (■) = tanh(-). These 
system parameters were utilised as they allow us to perform 
the zero forcing solution, without a prohibitive computational 
burden. 



A. Analysis of mixing and convergence of MCMC-ABC 

We analyze the impact that the ABC tolerance level e™'" 
has on estimation performance of channel coefficients and 
the mixing properties of the Markov chain for the MCMC- 
ABC algorithm. The study involves joint estimation of channel 
coefficients and transmitted symbols at an SNR level of 15dB, 
with L = 5 relays present in the system. We adopt a scaled 
Euclidean distance metric with the HD weighting function 
(6) and empirically monitor the mixing of the Markov chain 
through the autocorrelation function (ACF). 

In Fig. 2 we present a study of the ACF of the Markov 
chains for the channel estimations of Gi and Hi as a function 
of the tolerance e, and the associated estimated marginal 
posterior distributions p{gi\Y) and p{hi\Y). For large e the 
Markov chain mixes over the posterior support efficiently, 
since when the tolerance is large, the HD weighting function 
and therefore the acceptance probability, will regularly be 
non-zero. In addition, with a large tolerance the posterior 
almost exactly recovers the prior given by the partial CSI. 
This is expected since a large tolerance results in a weak 
contribution from the likelihood. As the tolerance e decreases, 
the posterior distribution precision increases and there is a 
translation from the prior partial CSI channel estimates to 
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Fig. 2. Comparison of performance for MCMC-ABC with Hard 
Decision weighting and Scaled Euclidean distance metric. Subplots 
on the left of the image display how the estimated ACF changes 
as a function of tolerance level e for the estimated channels of the 
relay system. Subplots on the right of the image display a sequence 
of smoothed marginal posterior distribution estimates for the first 
channel of the relay, as the tolerance decreases. Note, the indexing 
of each marginal distribution with labels 1,2,... conesponds to the 
tolerance given in the legend on the left hand plots. 



the posterior distribution over the true generated channel 
coefficients for the given frame of data communications. 

It is evident that the mixing properties of the MCMC- 
ABC algorithm are impacted by the choice of tolerance level. 
A decreasing tolerance results in more accurate posterior 
distribution, albeit at the price of slower chain mixing. Clearly, 
the ACF tail decay rate is significantly slower as the tolerance 
reduces. 

Note, that although the results are not presented here, we 
also performed analysis for all aspects of Algorithm 1 under 
the setting in which the relay function is linear. We confirmed 
the MMSE estimates of the channel coefficients and the 
MAP sequence detector results were accurate for a range of 
SNR values. The results presented here are for a much more 
challenging setting in which the relay is highly non-linear, 
given by a hyperbolic tangent function. 

B. Analysis of ABC model specifications 

We now examine the effect of the distance metric and 
weighting function on the performance of the MCMC-ABC 
algorithm as a function of the tolerance. We consider HD and 
SD weighting functions with both Mahalanobis and scaled 
Euclidean distance metrics. We consider the estimated ACF 
of the Markov chains of each of the channel coefficients G 
and H. The SNR level was set to 15dB and L = 5 relays were 
present. The results are presented for one channel; since all 
channels are i.i.d. this will be indicative of the performance 
of all channels. 

For comparison, an equivalent ABC posterior precision 
should be obtained under each algorithm. Since, the weighting 
and distance functions are different, this will result in different 



— SD Mahalanobis 
HD Mahalanobis 

— HD scaled Euclidean 
- SD scaled Euclidean 



Fig. 3. Maximum distance between the EDF and the baseline "true" 
EDF for the first channel, estimated cdf for Gi, averaged over 20 
independent data realisations. 



e""" values for each choice in the MCMC-ABC algorithm. 
As a result, analysis proceeded by first taking a minimum 
base epsilon value, Cf, = 0.2 and running the MCMC- 
ABC with soft decision Gaussian weighting and Mahalanobis 
distance for 100, 000 simulations. We ensured that the average 
acceptance rate was between [0.1, 0.3]. This produced a "true" 
empirical distribution function (EDF) which we used as our 
baseline comparison estimate of the true cdf. Now, for a 
range of tolerance values e,; = [0.25,0.5,0.75, 1] we ran the 
MCMC-ABC algorithm with soft decision Gaussian weighting 
and Mahalanobis distance for 20,000 iterations, ensuring the 
average acceptance rate was in the interval [0.1,0.3]. This 
produced a set of random walk standard deviation (Ji,(i), 
one for each tolerance, that we used for the analysis in 
the remaining choices of the MCMC-ABC algorithms. For 
comparison purposes we recorded the estimated maximum 
error between the estimated EDF for each algorithm and the 
baseline "true" EDF. We repeated the simulations for each 
tolerance on 20 independent generated data sets. 

In Fig. 3 we present the results of this analysis, which 
demonstrate that the algorithm producing the most accurate 
results utilised the soft decision and Mahalanobis distance. 
The worst performance involved the hard decision and scaled 
Euclidean distance. In this case, at low tolerances the aver- 
age distance between the EDF and the baseline "true" EDF 
was a maximum, since the algorithm was not mixing. This 
demonstrates that such low tolerances under this setting of 
the MCMC-ABC algorithm will produce poor performance, 
relative to the soft decision with Mahalanobis distance. 



C. Comparisons of detector performance 

Finally, we present an analysis of the symbol error rate 
(SER) under the MCMC-ABC algorithm (with a SD weighting 
function and Mahalanobis distance), the MCMC-AV detector 
algorithm, and the SES-ZF and Oracle detectors. Specifi- 
cally, we systematically study the SER as a function of 
the number of relays, L e {1,2,5,10} and the SNR G 
{0,5,10,15,20,25,30}. 
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Fig. 4. SER performance of each of the proposed detector schemes 
as a function of the number of relay links, L. For each relay set up 
the SER is reported as a function of the SNR. 



The results of this analysis are presented in Fig. 4. In sum- 
mary these results demonstrate that under our proposed system 
model and detection algorithms, spatial diversity stemming 
from an increasing number of relays results in measurable 
improvements in the SER performance. For example Fig. 
4 demonstrates that for L = 1, there is an insignificant 
difference between the results obtained for algorithms MCMC- 
ABC, MCMC-AV and SES-ZF However, as L increases, SES- 
ZF has the worst performance and degrades relative to the 
MCMC-based approaches, demonstrating the utility obtained 
by developing a more sophisticated detector algorithm. It is 
clear that the SES-ZF suffers from an error-floor effect: as 
the SNR increases the SER is almost constant for SNR values 
above 15dB. 

Also in Fig. 4 the two MCMC-based approaches demon- 
strate comparable performance for small L. However as L 
increases, in the high SNR region, the difference in perfor- 
mance between the MCMC-AV and MCMC-ABC algorithms 
increases. This could be due to the greater numbers of auxil- 
iary parameters to be estimated in the auxiliary-based approach 
as L increases. In particular we note that adding an additional 
relay introduces K additional nuisance parameters into the 
auxiliary model posterior. 

VIII. Conclusions 

In this paper, we proposed a novel cooperative relay system 
model and then obtained novel detector algorithms. In particu- 
lar, this involved an approximated-MAP sequence detector for 
a coherent multiple relay system, where the relay processing 
function is non-linear. Using the ABC methodology we were 
able to perform "likelihood free" inference on the parameters 
of interest. Simulation results validate the effectiveness of the 
proposed scheme. In addition to the ABC approach, we devel- 
oped an alternative exact novel algorithm, MCMC-AV, based 
on auxiliary variables. Finally, we developed a sub-optimal 
zero forcing solution. We then studied the performance of each 



algorithm under different settings of our relay system model, 
including the size of the network and the noise level present. 
As a result of our findings, we recommend the use of the 
MCMC-ABC detector especially when there are many relays 
present at the network, or the number of symbols transmitted 
in each frame is large. In settings where the number of relays is 
moderate, one could consider using the MCMC-AV algorithm, 
as its performance was on par with the MCMC-ABC results 
and does not involve an ABC approximation. 

Future research includes the design of detection algorithms 
for relay systems with partial CSI in which the relay system 
topology may contain multiple hops on a given channel, or 
the relay network topology may be unknown. This can include 
aspects such as an unknown number of relay channels or hops 
per relay channel. 
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